Abstract

We show that for any metric probability space $(M, d, \mu)$ with a subgaussian constant $\sigma^2(\mu)$ and any set $A \subset M$ we have $\sigma^2(\mu_A) \le c \log(e/\mu(A))\, \sigma^2(\mu)$, where $\mu_A$ is a restriction of $\mu$ to the set $A$ and $c$ is a universal constant. As a consequence we deduce concentration inequalities for non-Lipschitz functions.


1 Introduction 

It is known that many high-dimensional probability distributions $\mu$ on the Euclidean space $\mathbb{R}^n$ (and other metric spaces, including graphs) possess strong concentration properties. In a functional language, this may informally be stated as the assertion that any sufficiently smooth function $f$ on $\mathbb{R}^n$, e.g., having a bounded Lipschitz semi-norm, is almost a constant on almost all space. There are several ways to quantify such a property. One natural approach proposed by N. Alon, R. Boppana and J. Spencer [A-B-S] associates with a given metric probability space $(M, d, \mu)$ its spread constant,

$$ s^2(\mu) = \sup \mathrm{Var}_\mu(f) = \sup \int (f - m)^2 \, d\mu, $$

where $m = \int f \, d\mu$, and the sup is taken over all functions $f$ on $M$ with $\|f\|_{\mathrm{Lip}} \le 1$. More information is contained in the so-called subgaussian constant $\sigma^2 = \sigma^2(\mu)$ which is defined as the infimum over all $\sigma^2$ such that

$$ \int e^{tf} \, d\mu \le e^{\sigma^2 t^2/2}, \quad \text{for all } t \in \mathbb{R}, \tag{1.1} $$

for any $f$ on $M$ with $m = 0$ and $\|f\|_{\mathrm{Lip}} \le 1$ (cf. [B-G-H]). This quantity may also be introduced via the transport-entropy inequality relating the classical Kantorovich distance and the relative entropy from an arbitrary probability measure on $M$ to the measure $\mu$ (cf. [B-G]).

While in general $s^2 \le \sigma^2$, the latter characteristic allows one to control subgaussian tails under the probability measure $\mu$ uniformly in the entire class of Lipschitz functions on $M$. More generally, when $\|f\|_{\mathrm{Lip}} \le L$, (1.1) yields

$$ \mu\{|f - m| \ge t\} \le 2 e^{-t^2/(2\sigma^2 L^2)}, \quad t > 0. \tag{1.2} $$

Classical and well-known examples include the standard Gaussian measure on $M = \mathbb{R}^n$ in which case $s^2 = \sigma^2 = 1$, and the normalized Lebesgue measure on the unit sphere $M = S^{n-1}$ with $s^2 = \sigma^2 = 1/n$. The last example was a starting point in the study of the concentration of measure phenomena, a fruitful direction initiated in the early 1970s by V. D. Milman.

Other examples come often after verification that $\mu$ satisfies certain Sobolev-type inequalities such as Poincaré-type inequalities

$$ \lambda_1 \mathrm{Var}_\mu(u) \le \int |\nabla u|^2 \, d\mu, $$

and logarithmic Sobolev inequalities

$$ \rho\, \mathrm{Ent}_\mu(u^2) = \rho \left[ \int u^2 \log u^2 \, d\mu - \int u^2 \, d\mu \, \log \int u^2 \, d\mu \right] \le 2 \int |\nabla u|^2 \, d\mu, $$

where $u$ may be any locally Lipschitz function on $M$, and the constants $\lambda_1 > 0$ and $\rho > 0$ do not depend on $u$. Here the modulus of the gradient may be understood in the generalized sense as the function

$$ |\nabla u(x)| = \limsup_{y \to x} \frac{|u(x) - u(y)|}{d(x,y)}, \quad x \in M $$

(this is the so-called "continuous setting"), while in the discrete spaces, e.g., graphs, we deal with other naturally defined gradients. In both cases, one has respectively the well-known upper bounds

$$ s^2(\mu) \le \frac{1}{\lambda_1}, \qquad \sigma^2(\mu) \le \frac{1}{\rho}. \tag{1.3} $$

For example, $\lambda_1 = \rho = n - 1$ on the unit sphere (best possible values, [M-W]), which can be used to make a corresponding statement about the spread and subgaussian constants.

One of the purposes of this note is to give new examples by involving the family of the normalized restricted measures

$$ \mu_A(B) = \frac{\mu(A \cap B)}{\mu(A)}, $$

where a set $A \subset M$ is fixed and has a positive measure. As an example, returning to the standard Gaussian measure $\mu$ on $\mathbb{R}^n$, it is known that $\sigma^2(\mu_A) \le 1$ for any convex body $A \subset \mathbb{R}^n$. This remarkable property, discovered by D. Bakry and M. Ledoux [B-L] in a sharper form of a Gaussian-type isoperimetric inequality, has nowadays several proofs and generalizations, cf. [B1], [B2]. Of course, in general, the set $A$ may have a rather disordered structure, for example, it may be disconnected. And then there is no hope for validity of a Poincaré-type inequality for the measure $\mu_A$. Nevertheless, it turns out that the concentration property of $\mu_A$ is inherited from $\mu$, unless the measure of $A$ is too small. In particular, we have the following observation about abstract metric probability spaces.

Theorem 1.1. For any measurable set $A \subset M$ with $\mu(A) > 0$, the subgaussian constant $\sigma^2(\mu_A)$ of the normalized restricted measure satisfies

$$ \sigma^2(\mu_A) \le c \log\left(\frac{e}{\mu(A)}\right) \sigma^2(\mu), \tag{1.4} $$

where $c$ is an absolute constant.

One may further generalize this assertion by defining the subgaussian constant $\sigma^2_{\mathcal{F}}(\mu)$ within a given fixed subclass $\mathcal{F}$ of functions on $M$, by using the same bound (1.1) on the Laplace transform. This is motivated by a possible different level of concentration for different classes; indeed, in case of $M = \mathbb{R}^n$, the concentration property may considerably be strengthened for the class $\mathcal{F}$ of all convex Lipschitz functions. In particular, one result of M. Talagrand [T1], [T2] provides a dimension-free bound $\sigma^2_{\mathcal{F}}(\mu) \le C$ for an arbitrary product probability measure $\mu$ on the $n$-dimensional cube $[-1,1]^n$. Hence, a more general version of Theorem 1.1 yields the bound

$$ \sigma^2_{\mathcal{F}}(\mu_A) \le c \log\left(\frac{e}{\mu(A)}\right) $$

with some absolute constant $c$, which holds for any Borel subset $A$ of $[-1,1]^n$ (cf. Section 6 below).

According to the very definition, the quantities $\sigma^2(\mu)$ and $\sigma^2(\mu_A)$ might seem to be responsible for deviations of only Lipschitz functions $f$ on $M$ and $A$, respectively. However, the inequality (1.4) may also be used to control deviations of non-Lipschitz $f$ on large parts of the space and under certain regularity hypotheses. Assume, for example, $\int |\nabla f| \, d\mu \le 1$ (which is kind of a normalization condition) and consider

$$ A = \{x \in M : |\nabla f(x)| \le L\}. \tag{1.5} $$

If $L \ge 2$, this set has the measure $\mu(A) \ge 1 - \frac{1}{L} \ge \frac{1}{2}$, and hence, $\sigma^2(\mu_A) \le c\, \sigma^2(\mu)$ with some absolute constant $c$. If we assume that $f$ has a Lipschitz semi-norm $\le L$ on $A$, then, according to (1.2),

$$ \mu_A\{x \in A : |f - m| \ge t\} \le 2 e^{-t^2/(c \sigma^2(\mu) L^2)}, \quad t > 0, \tag{1.6} $$

where $m$ is the mean of $f$ with respect to $\mu_A$. It is in this sense one may say that $f$ is almost a constant on the set $A$.

This also yields a corresponding deviation bound on the whole space,

$$ \mu\{x \in M : |f - m| \ge t\} \le 2 e^{-t^2/(c \sigma^2(\mu) L^2)} + \frac{1}{L}. $$

Stronger integrability conditions posed on $|\nabla f|$ can considerably sharpen the conclusion. By a similar argument, Theorem 1.1 yields, for example, the following exponential bound, known in the presence of a logarithmic Sobolev inequality for the space $(M, d, \mu)$, and with $\sigma^2$ replaced by $1/\rho$ (cf. [B-G]).

Corollary 1.2. Let $f$ be a locally Lipschitz function on $M$ with Lipschitz semi-norms $\le L$ on the sets (1.5). If $\int e^{|\nabla f|^2} \, d\mu \le 2$, then $f$ is $\mu$-integrable, and moreover,

$$ \mu\{x \in M : |f - m| \ge t\} \le 2 e^{-t/(c \sigma(\mu))}, \quad t > 0, $$

where $m$ is the $\mu$-mean of $f$ and $c$ is an absolute constant.

Equivalently (up to an absolute factor), we have a Sobolev-type inequality

$$ \|f - m\|_{\psi_1} \le c\, \sigma(\mu)\, \|\nabla f\|_{\psi_2}, $$

connecting the $\psi_1$-norm of $f - m$ with the $\psi_2$-norm of the modulus of the gradient of $f$. We prove a more general version of this corollary in Section 6 (cf. Theorem 6.1). As will be explained in the same section, similar assertions may also be made about convex $f$ and product measures $\mu$ on $M = [-1,1]^n$, thus extending Talagrand's theorem to the class of non-Lipschitz functions.

In view of the right bound in (1.3) and (1.4), the spread and subgaussian constants for restricted measures can be controlled in terms of the logarithmic Sobolev constant $\rho$ via

$$ s^2(\mu_A) \le \sigma^2(\mu_A) \le c \log\left(\frac{e}{\mu(A)}\right) \frac{1}{\rho}. $$

However, it may happen that $\rho = 0$ and $\sigma^2(\mu) = \infty$, while $\lambda_1 > 0$ (e.g., for the product exponential distribution on $\mathbb{R}^n$). Then one may wonder whether one can estimate the spread constant of a restricted measure in terms of the spectral gap. In that case there is a bound similar to (1.4).

Theorem 1.3. Assume the metric probability space $(M, d, \mu)$ satisfies a Poincaré-type inequality with $\lambda_1 > 0$. For any $A \subset M$ with $\mu(A) > 0$, with some absolute constant $c$,

$$ s^2(\mu_A) \le c \log^2\left(\frac{e}{\mu(A)}\right) \frac{1}{\lambda_1}. \tag{1.7} $$

It should be mentioned that the logarithmic terms in (1.4) and (1.7) may not be removed and are actually asymptotically optimal as functions of $\mu(A)$, as $\mu(A)$ is getting small, see Section 7.

Our contribution below is organized into sections as follows:

2. Bounds on $\psi_\alpha$-norms for restricted measures.

3. Proof of Theorem 1.1. Transport-entropy formulation.

4. Proof of Theorem 1.3. Spectral gap.

5. Examples.

6. Deviations for non-Lipschitz functions.

7. Optimality.

8. Appendix.

2 Bounds on $\psi_\alpha$-norms for restricted measures

A measurable function $f$ on the probability space $(M, \mu)$ is said to have a finite $\psi_\alpha$-norm, $\alpha \ge 1$, if for some $r > 0$,

$$ \int e^{(|f|/r)^\alpha} \, d\mu \le 2. $$

The infimum over all such $r$ represents the $\psi_\alpha$-norm $\|f\|_{\psi_\alpha}$ or $\|f\|_{L^{\psi_\alpha}(\mu)}$, which is just the Orlicz norm associated with the Young function $\psi_\alpha(t) = e^{|t|^\alpha} - 1$.

We are mostly interested in the particular cases $\alpha = 1$ and $\alpha = 2$. In this section we recall well-known relations between the $\psi_1$ and $\psi_2$-norms and the usual $L^p$-norms $\|f\|_p = \|f\|_{L^p(\mu)} = \left(\int |f|^p \, d\mu\right)^{1/p}$. For the readers' convenience, we include the proof in the appendix.

Lemma 2.1. We have

$$ \sup_{p \ge 1} \frac{\|f\|_p}{\sqrt{p}} \le \|f\|_{L^{\psi_2}(\mu)} \le 4 \sup_{p \ge 1} \frac{\|f\|_p}{\sqrt{p}}, \tag{2.1} $$

$$ \sup_{p \ge 1} \frac{\|f\|_p}{p} \le \|f\|_{L^{\psi_1}(\mu)} \le 6 \sup_{p \ge 1} \frac{\|f\|_p}{p}. \tag{2.2} $$

Given a measurable subset $A$ of $M$ with $\mu(A) > 0$, we consider the normalized restricted measure $\mu_A$ on $M$, i.e.,

$$ \mu_A(B) = \frac{\mu(A \cap B)}{\mu(A)}, \quad B \subset M. $$

Our basic tool leading to Theorem 1.1 will be the following assertion.

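Proposition 2.2. For any measurable function $f$ on $M$,

$$ \|f\|_{L^{\psi_2}(\mu_A)} \le 4e \log^{1/2}\left(\frac{e}{\mu(A)}\right) \|f\|_{L^{\psi_2}(\mu)}. \tag{2.3} $$

Proof. Assume that $\|f\|_{L^{\psi_2}(\mu)} = 1$ and fix $p \ge 1$. By the left inequality in (2.1), for any $q \ge 1$,

$$ q^{q/2} \ge \int |f|^q \, d\mu \ge \mu(A) \int |f|^q \, d\mu_A, $$

so

$$ \frac{\|f\|_{L^q(\mu_A)}}{\sqrt{q}} \le \left(\frac{1}{\mu(A)}\right)^{1/q}. $$

But by the right inequality in (2.1),

$$ \|f\|_{\psi_2} \le 4 \sup_{q \ge 1} \frac{\|f\|_q}{\sqrt{q}} \le 4\sqrt{p}\, \sup_{q \ge p} \frac{\|f\|_q}{\sqrt{q}}. $$

Applying it on the space $(M, \mu_A)$, we then get

$$ \|f\|_{L^{\psi_2}(\mu_A)} \le 4\sqrt{p}\, \sup_{q \ge p} \frac{\|f\|_{L^q(\mu_A)}}{\sqrt{q}} \le 4\sqrt{p}\, \sup_{q \ge p} \left(\frac{1}{\mu(A)}\right)^{1/q} = 4\sqrt{p} \left(\frac{1}{\mu(A)}\right)^{1/p}. $$

The obtained inequality,

$$ \|f\|_{L^{\psi_2}(\mu_A)} \le 4\sqrt{p} \left(\frac{1}{\mu(A)}\right)^{1/p}, $$

holds true for any $p \ge 1$ and therefore may be optimized over $p$. Choosing $p = \log\left(\frac{e}{\mu(A)}\right)$, for which $(1/\mu(A))^{1/p} \le e$, we arrive at (2.3). □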
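
The last optimization step can be sanity-checked numerically. The following is only an illustrative sketch, not part of the proof; the function names `restricted_bound` and `final_bound` are ours. It evaluates the intermediate bound $4\sqrt{p}\,(1/\mu(A))^{1/p}$ at the choice $p = \log(e/\mu(A))$ and compares it with the right-hand side of (2.3) for $\|f\|_{L^{\psi_2}(\mu)} = 1$.

```python
import math

def restricted_bound(mu_A: float, p: float) -> float:
    """Intermediate bound 4 * sqrt(p) * (1 / mu_A) ** (1 / p) from the proof."""
    return 4.0 * math.sqrt(p) * (1.0 / mu_A) ** (1.0 / p)

def final_bound(mu_A: float) -> float:
    """Right-hand side 4e * sqrt(log(e / mu_A)) of (2.3) for a unit psi_2-norm."""
    return 4.0 * math.e * math.sqrt(math.log(math.e / mu_A))

for mu_A in (1.0, 0.5, 1e-2, 1e-6, 1e-12):
    p = math.log(math.e / mu_A)   # the choice made in the proof; always p >= 1
    assert p >= 1.0
    # (1 / mu_A) ** (1 / p) = exp(log(1/mu_A) / (1 + log(1/mu_A))) stays below e
    assert restricted_bound(mu_A, p) <= final_bound(mu_A) * (1 + 1e-12)
```

The check passes for every tested value of $\mu(A)$ because the exponent $\log(1/\mu(A)) / (1 + \log(1/\mu(A)))$ is always smaller than 1.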

A possible weak point in the bound (2.3) is that the means of $f$ are not involved. For example, in applications, if $f$ were defined only on $A$ and had $\mu_A$-mean zero, we might need to find an extension of $f$ to the whole space $M$ keeping the mean zero with respect to $\mu$. In fact, this should not create any difficulty, since one may work with the symmetrization of $f$.

More precisely, we may apply Proposition 2.2 on the product space $(M \times M, \mu \otimes \mu)$ to the product sets $A \times A$ and functions of the form $f(x) - f(y)$. Then we get

$$ \|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)} \le 4e \log^{1/2}\left(\frac{e}{\mu(A)^2}\right) \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}. $$

Since $\log\left(\frac{e}{\mu(A)^2}\right) \le 2 \log\left(\frac{e}{\mu(A)}\right)$, we arrive at:

Corollary 2.3. For any measurable function $f$ on $M$,

$$ \|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)} \le 4e\sqrt{2}\, \log^{1/2}\left(\frac{e}{\mu(A)}\right) \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}. $$

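Let us now derive an analog of Proposition 2.2 for the $\psi_1$-norm, using similar arguments. Assume that $\|f\|_{L^{\psi_1}(\mu)} = 1$ and fix $p \ge 1$. By the left inequality in (2.2), for any $q \ge 1$,

$$ q^q \ge \int |f|^q \, d\mu \ge \mu(A) \int |f|^q \, d\mu_A, $$

so

$$ \frac{\|f\|_{L^q(\mu_A)}}{q} \le \left(\frac{1}{\mu(A)}\right)^{1/q}. $$

But, by the inequality (2.2),

$$ \|f\|_{\psi_1} \le 6 \sup_{q \ge 1} \frac{\|f\|_q}{q} \le 6p\, \sup_{q \ge p} \frac{\|f\|_q}{q}. $$

Applying it on the space $(M, \mu_A)$, we get

$$ \|f\|_{L^{\psi_1}(\mu_A)} \le 6p\, \sup_{q \ge p} \frac{\|f\|_{L^q(\mu_A)}}{q} \le 6p\, \sup_{q \ge p} \left(\frac{1}{\mu(A)}\right)^{1/q} = 6p \left(\frac{1}{\mu(A)}\right)^{1/p}. $$

The obtained inequality,

$$ \|f\|_{L^{\psi_1}(\mu_A)} \le 6p \left(\frac{1}{\mu(A)}\right)^{1/p}, $$

holds true for any $p \ge 1$ and therefore may be optimized over $p$. Choosing $p = \log\left(\frac{e}{\mu(A)}\right)$, we arrive at:

Proposition 2.4. For any measurable function $f$ on $M$, we have

$$ \|f\|_{L^{\psi_1}(\mu_A)} \le 6e \log\left(\frac{e}{\mu(A)}\right) \|f\|_{L^{\psi_1}(\mu)}. $$

Similarly to Corollary 2.3 one may write down this relation on the product probability space $(M \times M, \mu \otimes \mu)$ with the functions of the form $\tilde{f}(x, y) = f(x) - f(y)$ and the product sets $\tilde{A} = A \times A$. Then we get

$$ \|f(x) - f(y)\|_{L^{\psi_1}(\mu_A \otimes \mu_A)} \le 12e \log\left(\frac{e}{\mu(A)}\right) \|f(x) - f(y)\|_{L^{\psi_1}(\mu \otimes \mu)}. \tag{2.4} $$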


3 Proof of Theorem 1.1. Transport-entropy formulation

The finiteness of the subgaussian constant for a given metric probability space $(M, d, \mu)$ means that $\psi_2$-norms of Lipschitz functions on $M$ with mean zero are uniformly bounded. Equivalently, for any (for all) $x_0 \in M$, we have that, for some $\lambda > 0$,

$$ \int e^{d(x, x_0)^2/\lambda^2} \, d\mu(x) < \infty. $$

The definition (1.1) of $\sigma^2(\mu)$ inspires to consider another norm-like quantity

$$ \sigma_f^2 = \sup_{t \ne 0} \left[ \frac{1}{t^2/2} \log \int e^{tf} \, d\mu \right]. $$

Here is a well-known relation (with explicit numerical constants) which holds in the setting of an abstract probability space $(M, \mu)$. Once again, we include a proof in the appendix for completeness.

Lemma 3.1. If $f$ has mean zero and finite $\psi_2$-norm, then

$$ \frac{1}{\sqrt{6}} \|f\|_{\psi_2}^2 \le \sigma_f^2 \le 4 \|f\|_{\psi_2}^2. $$

One can now relate the subgaussian constant of the restricted measure to the subgaussian constant of the original measure. Let now $(M, d, \mu)$ be a metric probability space. First, Lemma 3.1 immediately yields an equivalent description in terms of $\psi_2$-norms, namely

$$ \frac{1}{\sqrt{6}} \sup_f \|f\|_{\psi_2}^2 \le \sigma^2(\mu) \le 4 \sup_f \|f\|_{\psi_2}^2, \tag{3.1} $$

where the supremum is running over all $f : M \to \mathbb{R}$ with $\mu$-mean zero and $\|f\|_{\mathrm{Lip}} \le 1$. Here, one can get rid of the mean zero assumption by considering functions of the form $f(x) - f(y)$ on the product space $(M \times M, \mu \otimes \mu)$. If $f$ has mean zero, then, by Jensen's inequality, for any $r > 0$,

$$ \iint e^{(f(x) - f(y))^2/r^2} \, d\mu(x) \, d\mu(y) \ge \int e^{f(x)^2/r^2} \, d\mu(x), $$

which implies that

$$ \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)} \ge \|f\|_{L^{\psi_2}(\mu)}. $$

On the other hand, by the triangle inequality,

$$ \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)} \le 2\, \|f\|_{L^{\psi_2}(\mu)}. $$

Hence, we arrive at another, more flexible relation, where the mean zero assumption may be removed.

Lemma 3.2. We have

$$ \frac{1}{4\sqrt{6}} \sup_f \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2 \le \sigma^2(\mu) \le 4 \sup_f \|f(x) - f(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2, $$

where the supremum is running over all functions $f$ on $M$ with $\|f\|_{\mathrm{Lip}} \le 1$.

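Proof of Theorem 1.1. We are prepared to make the last steps for the proof of the inequality (1.4). We use the well-known Kirszbraun's theorem: Any function $f : A \to \mathbb{R}$ with Lipschitz semi-norm $\|f\|_{\mathrm{Lip}} \le 1$ on $A$ admits a Lipschitz extension to the whole space (cf. [M-S]). Namely, one may put

$$ \tilde{f}(x) = \inf_{a \in A} \left[ f(a) + d(a, x) \right], \quad x \in M. $$

Applying first Corollary 2.3 and then the left inequality of Lemma 3.2 to $\tilde{f}$, we get

$$ \|f(x) - f(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)}^2 = \|\tilde{f}(x) - \tilde{f}(y)\|_{L^{\psi_2}(\mu_A \otimes \mu_A)}^2 $$
$$ \le (4e\sqrt{2})^2 \log\left(\frac{e}{\mu(A)}\right) \|\tilde{f}(x) - \tilde{f}(y)\|_{L^{\psi_2}(\mu \otimes \mu)}^2 $$
$$ \le (4e\sqrt{2})^2 \log\left(\frac{e}{\mu(A)}\right) \cdot (4\sqrt{6})^2\, \sigma^2(\mu). $$

Another application of Lemma 3.2 in the space $(A, d, \mu_A)$ (now the right inequality) yields

$$ \sigma^2(\mu_A) \le 4 \cdot (4e\sqrt{2})^2 \cdot (4\sqrt{6})^2 \log\left(\frac{e}{\mu(A)}\right) \sigma^2(\mu). $$

This is exactly (1.4) with constant $c = 4 \cdot (4e\sqrt{2})^2 (4\sqrt{6})^2 = 3 \cdot 2^{12} e^2 = 90796.72\ldots$ □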
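
The explicit extension formula used in the proof is easy to experiment with on a finite metric space. The sketch below is only an illustration under our own assumptions: the helper name `lipschitz_extension`, the toy space $M = \{0, \ldots, 10\}$ with $d(x, y) = |x - y|$, and the values on $A$ are all hypothetical choices, not taken from the paper.

```python
def lipschitz_extension(values_on_A, dist):
    """Return x -> min over a in A of [f(a) + d(a, x)], with f given as a dict on a finite set A."""
    def f_ext(x):
        return min(fa + dist(a, x) for a, fa in values_on_A.items())
    return f_ext

# Toy example (hypothetical data): M = {0, ..., 10}, d(x, y) = |x - y|, A = {0, 3, 4, 9}.
dist = lambda x, y: abs(x - y)
f_on_A = {0: 0.0, 3: 1.0, 4: 0.5, 9: 2.0}   # 1-Lipschitz on A with respect to dist
f_ext = lipschitz_extension(f_on_A, dist)
M = range(11)

# The extension agrees with f on A ...
assert all(f_ext(a) == fa for a, fa in f_on_A.items())
# ... and is 1-Lipschitz on the whole space M.
assert all(abs(f_ext(x) - f_ext(y)) <= dist(x, y) for x in M for y in M)
```

Both properties hold for the same reason as in the general case: each map $x \mapsto f(a) + d(a, x)$ is 1-Lipschitz, an infimum of 1-Lipschitz functions is 1-Lipschitz, and the Lipschitz property of $f$ on $A$ guarantees that the infimum at a point $a \in A$ is attained at $a$ itself.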

Remark 3.3. Let us also record the following natural generalization of Theorem 1.1 which is obtained along the same arguments. Given a collection $\mathcal{F}$ of (integrable) functions on the probability space $(M, \mu)$, define $\sigma^2_{\mathcal{F}}(\mu)$ as the infimum over all $\sigma^2$ such that

$$ \int e^{t(f - m)} \, d\mu \le e^{\sigma^2 t^2/2}, \quad \text{for all } t \in \mathbb{R}, $$

for any $f \in \mathcal{F}$, where $m = \int f \, d\mu$. Then with the same constant $c$ as in Theorem 1.1, for any measurable $A \subset M$, $\mu(A) > 0$, we have

$$ \sigma^2_{\mathcal{F}_A}(\mu_A) \le c \log\left(\frac{e}{\mu(A)}\right) \sigma^2_{\mathcal{F}}(\mu), $$

where $\mathcal{F}_A$ denotes the collection of restrictions of functions $f$ from $\mathcal{F}$ to the set $A$.

Let us now mention an interesting connection of the subgaussian constants with the Kantorovich distances

$$ W_1(\mu, \nu) = \inf_\pi \iint d(x, y) \, d\pi(x, y) $$

and the relative entropies

$$ D(\nu \| \mu) = \int \log \frac{d\nu}{d\mu} \, d\nu $$

(called also Kullback-Leibler's distances or informational divergences). Here, $\nu$ is a probability measure on $M$, which is absolutely continuous with respect to $\mu$ (for short, $\nu \ll \mu$), and the infimum in the definition of $W_1$ is running over all probability measures $\pi$ on the product space $M \times M$ with marginal distributions $\mu$ and $\nu$, i.e., such that

$$ \pi(B \times M) = \mu(B), \quad \pi(M \times B) = \nu(B) \quad (\text{Borel } B \subset M). $$

As was shown in [B-G], if $(M, d)$ is a Polish space (complete separable), the subgaussian constant $\sigma^2 = \sigma^2(\mu)$ may be described as an optimal value in the transport-entropy inequality

$$ W_1(\mu, \nu) \le \sqrt{2 \sigma^2 D(\nu \| \mu)}. \tag{3.2} $$

Hence, we obtain from the inequality (1.4) a similar relation for measures $\nu$ supported on given subsets of $M$.

Corollary 3.4. Given a Borel probability measure $\mu$ on a Polish space $(M, d)$ and a closed set $A$ in $M$ such that $\mu(A) > 0$, for any Borel probability measure $\nu$ supported on $A$,

$$ W_1^2(\mu_A, \nu) \le c\, \sigma^2(\mu) \log\left(\frac{e}{\mu(A)}\right) D(\nu \| \mu_A), $$

where $c$ is an absolute constant.

This assertion is actually equivalent to Theorem 1.1. Note that, for $\nu$ supported on $A$, there is an identity $D(\nu \| \mu_A) = \log \mu(A) + D(\nu \| \mu)$. In particular, $D(\nu \| \mu_A) \le D(\nu \| \mu)$, so the relative entropies decrease when turning to restricted measures.
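
The entropy identity in the last paragraph can be checked directly on a finite space. The following is a minimal sketch with made-up data: the four-point measure `mu`, the set `A` and the measure `nu` are hypothetical, and `kl` is our own helper for the relative entropy of finite distributions.

```python
import math

def kl(nu, mu):
    """Relative entropy D(nu || mu) for finite distributions given as dicts point -> mass."""
    return sum(q * math.log(q / mu[x]) for x, q in nu.items() if q > 0)

mu = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}       # hypothetical reference measure
A = {1, 3}
mu_of_A = sum(mu[x] for x in A)
mu_A = {x: mu[x] / mu_of_A for x in A}      # normalized restriction of mu to A
nu = {1: 0.7, 3: 0.3}                       # a probability measure supported on A

lhs = kl(nu, mu_A)
rhs = math.log(mu_of_A) + kl(nu, mu)

assert abs(lhs - rhs) < 1e-12               # D(nu || mu_A) = log mu(A) + D(nu || mu)
assert lhs <= kl(nu, mu)                    # entropy decreases for the restricted measure
```

The identity follows because the density of $\nu$ with respect to $\mu_A$ equals $\mu(A)$ times its density with respect to $\mu$ on $A$, and $\log \mu(A) \le 0$ gives the monotonicity.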

4 Proof of Theorem 1.3. Spectral gap

Theorem 1.1 insures, in particular, that, for any function $f$ on the metric probability space $(M, d, \mu)$ with Lipschitz semi-norm $\|f\|_{\mathrm{Lip}} \le 1$,

$$ \mathrm{Var}_{\mu_A}(f) \le c \log\left(\frac{e}{\mu(A)}\right) \sigma^2(\mu) $$

up to some absolute constant $c$. In fact, in order to reach a similar concentration property of the restricted measures, it is enough to start with a Poincaré-type inequality on $M$,

$$ \lambda_1 \mathrm{Var}_\mu(f) \le \int |\nabla f|^2 \, d\mu. $$

Under this hypothesis, a well-known theorem due to Gromov-Milman and Borovkov-Utev asserts that mean zero Lipschitz functions $f$ have bounded $\psi_1$-norm. One may use a variant of this theorem proposed by Aida and Stroock [A-S], who showed that

$$ \int e^{\sqrt{\lambda_1}\, f} \, d\mu \le K_0 = 1.720102\ldots \qquad (\|f\|_{\mathrm{Lip}} \le 1). $$

Hence

$$ \int e^{\sqrt{\lambda_1}\, |f|} \, d\mu \le 2K_0 \quad \text{and} \quad \int e^{\frac{1}{2}\sqrt{\lambda_1}\, |f|} \, d\mu \le \sqrt{2K_0} < 2, $$

thus implying that $\|f\|_{\psi_1} \le \frac{2}{\sqrt{\lambda_1}}$. In addition,

$$ \iint e^{\sqrt{\lambda_1}\, (f(x) - f(y))} \, d\mu(x) d\mu(y) \le K_0^2, \qquad \iint e^{\sqrt{\lambda_1}\, |f(x) - f(y)|} \, d\mu(x) d\mu(y) \le 2K_0^2 < 6. $$

From this,

$$ \iint e^{\frac{1}{3}\sqrt{\lambda_1}\, |f(x) - f(y)|} \, d\mu(x) d\mu(y) < 6^{1/3} < 2, $$

which means that $\|f(x) - f(y)\|_{\psi_1} \le \frac{3}{\sqrt{\lambda_1}}$ with respect to the product measure $\mu \otimes \mu$ on the product space $M \times M$. This inequality is translation invariant, so the mean zero assumption may be removed. Thus, we arrive at:

Lemma 4.1. Under the Poincaré-type inequality with spectral gap $\lambda_1 > 0$, for any mean zero function $f$ on $(M, d, \mu)$ with $\|f\|_{\mathrm{Lip}} \le 1$,

$$ \|f\|_{\psi_1} \le \frac{2}{\sqrt{\lambda_1}}. $$

Moreover, for any $f$ with $\|f\|_{\mathrm{Lip}} \le 1$,

$$ \|f(x) - f(y)\|_{L^{\psi_1}(\mu \otimes \mu)} \le \frac{3}{\sqrt{\lambda_1}}. \tag{4.1} $$

This is a version of the concentration of measure phenomenon (with exponential integrability) in presence of a Poincaré-type inequality. Our goal is therefore to extend this property to the normalized restricted measures $\mu_A$. This can be achieved by virtue of the inequality (2.4) which when combined with (4.1) yields an upper bound

$$ \|f(x) - f(y)\|_{L^{\psi_1}(\mu_A \otimes \mu_A)} \le 36e \log\left(\frac{e}{\mu(A)}\right) \frac{1}{\sqrt{\lambda_1}}. $$

Moreover, if $f$ has $\mu_A$-mean zero, the left norm dominates $\|f\|_{L^{\psi_1}(\mu_A)}$ (by Jensen's inequality). We can summarize, taking into account once again Kirszbraun's theorem, as we did in the proof of Theorem 1.1.

Proposition 4.2. Assume the metric probability space $(M, d, \mu)$ satisfies a Poincaré-type inequality with constant $\lambda_1 > 0$. Given a measurable set $A \subset M$ with $\mu(A) > 0$, for any function $f : A \to \mathbb{R}$ with $\mu_A$-mean zero and such that $\|f\|_{\mathrm{Lip}} \le 1$ on $A$,

$$ \|f\|_{L^{\psi_1}(\mu_A)} \le 36e \log\left(\frac{e}{\mu(A)}\right) \frac{1}{\sqrt{\lambda_1}}. $$

Theorem 1.3 is now easily obtained with constant $c = 2\,(36e)^2$ by noting that $L^2$-norms are dominated by $L^{\psi_1}$-norms. More precisely, since $e^{|t|} - 1 \ge \frac{1}{2} t^2$, one has $\|f\|_{\psi_1}^2 \ge \frac{1}{2} \|f\|_2^2$.
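
The numerical inequalities used in this section are elementary and can be verified in a few lines. The snippet below is only a sanity check, not part of the argument; it takes the truncated value $K_0 = 1.720102$ quoted above as given, and the variable names are ours.

```python
import math

K0 = 1.720102   # Aida-Stroock constant as quoted in the text (truncated decimal expansion)

checks = {
    "sqrt(2*K0) < 2": math.sqrt(2 * K0) < 2,   # gives the psi_1-norm bound 2/sqrt(lambda_1)
    "2*K0**2 < 6": 2 * K0 ** 2 < 6,            # bound for the symmetrized exponential moment
    "6**(1/3) < 2": 6 ** (1 / 3) < 2,          # gives the bound 3/sqrt(lambda_1) in (4.1)
}
assert all(checks.values())

psi1_product_constant = 12 * math.e * 3        # constant of (2.4) times constant of (4.1)
c_theorem_1_3 = 2 * (36 * math.e) ** 2         # constant c = 2 (36e)^2 in Theorem 1.3
```

With these values, `psi1_product_constant` equals $36e \approx 97.86$ and `c_theorem_1_3` is approximately $1.9 \cdot 10^4$.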
5 Examples

Theorems 1.1 and 1.3 involve a lot of interesting examples. Here are a few obvious cases.

1) The standard Gaussian measure $\mu = \gamma$ on $\mathbb{R}^n$ satisfies a logarithmic Sobolev inequality on $M = \mathbb{R}^n$ with a dimension-free constant $\rho = 1$. Hence, from Theorem 1.1 we get:

Corollary 5.1. For any measurable set $A \subset \mathbb{R}^n$ with $\gamma(A) > 0$, the subgaussian constant $\sigma^2(\gamma_A)$ of the normalized restricted measure $\gamma_A$ satisfies

$$ \sigma^2(\gamma_A) \le c \log\left(\frac{e}{\gamma(A)}\right), $$

where $c$ is an absolute constant.

As it was already mentioned, if $A$ is convex, there is a sharper bound $\sigma^2(\gamma_A) \le 1$. However, it may not hold without the convexity assumption. Nevertheless, if $\gamma(A)$ is bounded away from zero, we obtain a more universal principle.

Clearly, Corollary 5.1 extends to all product measures $\mu = \nu^n$ on $\mathbb{R}^n$ such that $\nu$ satisfies a logarithmic Sobolev inequality on the real line, and with constants $c$ depending on $\rho$, only. A characterization of the property $\rho > 0$ in terms of the distribution function of the measure $\nu$ and the density of its absolutely continuous component may be found in [B-G].

2) Consider a uniform distribution $\nu$ on the shell

$$ A_\varepsilon = \{x \in \mathbb{R}^n : 1 - \varepsilon \le |x| \le 1\}, \quad 0 \le \varepsilon \le 1 \quad (n \ge 2). $$

Corollary 5.2. The subgaussian constant of $\nu$ satisfies $\sigma^2(\nu) \le \frac{c}{n}$, up to some absolute constant $c$.

In other words, mean zero Lipschitz functions $f$ on $A_\varepsilon$ are such that $\sqrt{n}\, f$ are subgaussian. This property is well-known in the extreme cases: on the unit Euclidean ball $A = B_n$ ($\varepsilon = 1$) and on the unit sphere $A = S^{n-1}$ ($\varepsilon = 0$).

Let $\mu$ denote the normalized Lebesgue measure on $B_n$. In the case $\varepsilon \ge \frac{1}{n}$, the shell $A_\varepsilon$ represents the part of $B_n$ of measure

$$ \mu(A_\varepsilon) = 1 - (1 - \varepsilon)^n \ge 1 - \left(1 - \frac{1}{n}\right)^n \ge 1 - \frac{1}{e}. $$

Since the logarithmic Sobolev constant of the unit ball is of order $\frac{1}{n}$, and therefore $\sigma^2(\mu) \le \frac{c}{n}$, the assertion of Corollary 5.2 immediately follows from Theorem 1.1. If $\varepsilon \le \frac{1}{n}$, the assertion follows from a similar concentration property of the uniform distribution on the unit sphere. Indeed, with every Lipschitz function $f$ on $A_\varepsilon$ one may associate its restriction to $S^{n-1}$, which is also Lipschitz (with respect to the Euclidean distance). On the other hand, for any $r \in [1 - \varepsilon, 1]$ and $\theta \in S^{n-1}$, we have $|f(r\theta) - f(\theta)| \le |r - 1| \le \varepsilon \le \frac{1}{n}$, thus proving the claim.

3) The two-sided product exponential measure $\mu$ on $\mathbb{R}^n$ with density

$$ 2^{-n} e^{-(|x_1| + \dots + |x_n|)} $$

satisfies a Poincaré-type inequality on $M = \mathbb{R}^n$ with a dimension-free constant $\lambda_1 = 1/4$.

Hence, from Proposition 4.2 we get:

Corollary 5.3. For any measurable set $A \subset \mathbb{R}^n$ with $\mu(A) > 0$, and for any function $f : A \to \mathbb{R}$ with $\mu_A$-mean zero and $\|f\|_{\mathrm{Lip}} \le 1$, we have

$$ \|f\|_{L^{\psi_1}(\mu_A)} \le c \log\left(\frac{e}{\mu(A)}\right), $$

where $c$ is an absolute constant. In particular,

$$ s^2(\mu_A) \le c \log^2\left(\frac{e}{\mu(A)}\right). $$

Clearly, Corollary 5.3 extends to all product measures $\mu = \nu^n$ on $\mathbb{R}^n$ such that $\nu$ satisfies a Poincaré-type inequality on the real line, and with constants $c$ depending on $\lambda_1$, only. A characterization of the property $\lambda_1 > 0$ may also be given in terms of the distribution function of $\nu$ and the density of its absolutely continuous component (cf. [B-G]).

4a) Let us take the metric probability space $(\{0,1\}^n, d_n, \mu)$, where $d_n$ is the Hamming distance, that is, $d_n(x, y) = \#\{i : x_i \ne y_i\}$, equipped with the uniform measure $\mu$. For this particular space, Marton established the transport-entropy inequality (3.2) with an optimal constant $\sigma^2 = \frac{n}{4}$, cf. [Mar]. Using the relation (3.2) as an equivalent definition of the subgaussian constant, we obtain from Theorem 1.1:

Corollary 5.4. For any non-empty set $A \subset \{0,1\}^n$, the subgaussian constant $\sigma^2(\mu_A)$ of the normalized restricted measure $\mu_A$ satisfies, up to an absolute constant $c$,

$$ \sigma^2(\mu_A) \le c\, n \log\left(\frac{e}{\mu(A)}\right). \tag{5.1} $$

4b) Let us now assume that $A$ is monotone, i.e., $A$ satisfies the condition

$$ (x_1, \dots, x_n) \in A \ \Rightarrow \ (y_1, \dots, y_n) \in A, \quad \text{whenever } y_i \ge x_i, \ i = 1, \dots, n. $$

Recall that the discrete cube can be equipped with a natural graph structure: there is an edge between $x$ and $y$ whenever they are of Hamming distance $d_n(x, y) = 1$. For monotone sets $A$, the graph metric $d_A$ on the subgraph on $A$ is equal to the restriction of $d_n$ to $A \times A$. Indeed, we have:

$$ d_n(x, y) \le d_A(x, y) \le d_A(x, x \wedge y) + d_A(y, x \wedge y) = d_n(x, x \wedge y) + d_n(y, x \wedge y) = d_n(x, y), $$

where $x \wedge y = (x_1 \wedge y_1, \dots, x_n \wedge y_n)$. Thus,

$$ s^2(\mu_A, d_A) \le \sigma^2(\mu_A, d_A) \le c\, n \log\left(\frac{e}{\mu(A)}\right). $$

This can be compared with what follows from a recent result of Ding and Mossel (see [D-M]). The authors proved that the conductance (Cheeger constant) of $(A, \mu_A)$ satisfies $\varphi(A) \ge c\, \mu(A)/n$ with an absolute constant $c > 0$. However, this type of isoperimetric results may not imply sharp concentration bounds. Indeed, by using Cheeger's inequality, the above inequality leads to $\lambda_1 \ge c\, \mu(A)^2/n^2$ and $s^2(\mu_A, d_A) \le 1/\lambda_1 \le c\, n^2/\mu(A)^2$, which is even worse than the trivial estimate $s^2(\mu_A, d_A) \le \frac{1}{2} \mathrm{diam}(A)^2 \le n^2/2$.

5) Let $(M, d, \mu)$ be a (separable) metric probability space with finite subgaussian constant $\sigma^2(\mu)$. The previous example can be naturally generalized to the product space $(M^n, \mu^n)$, when it is equipped with the $\ell^1$-type metric

$$ d_n(x, y) = \sum_{i=1}^n d(x_i, y_i), \quad x = (x_1, \dots, x_n), \ y = (y_1, \dots, y_n) \in M^n. $$

This can be done with the help of the following elementary observation.

Proposition 5.5. The subgaussian constant of the space $(M^n, d_n, \mu^n)$ is related to the subgaussian constant of $(M, d, \mu)$ by the equality $\sigma^2(\mu^n) = n\, \sigma^2(\mu)$.

Indeed, one may argue by induction on $n$. Let $f$ be a function on $M^n$. The Lipschitz property $\|f\|_{\mathrm{Lip}} \le 1$ with respect to $d_n$ is equivalent to the assertion that $f$ is coordinatewise Lipschitz, that is, any function of the form $x_i \to f(x)$ has a Lipschitz semi-norm $\le 1$ on $M$ for all fixed coordinates $x_j \in M$ ($j \ne i$). Hence, in this case, for all $t \in \mathbb{R}$,

$$ \int_M e^{t f(x)} \, d\mu(x_n) \le \exp\left\{ t \int_M f(x) \, d\mu(x_n) + \frac{\sigma^2 t^2}{2} \right\}, $$

where $\sigma^2 = \sigma^2(\mu)$. Here the function $(x_1, \dots, x_{n-1}) \to \int_M f(x) \, d\mu(x_n)$ is also coordinatewise Lipschitz. Integrating the above inequality with respect to $d\mu^{n-1}(x_1, \dots, x_{n-1})$ and applying the induction hypothesis, we thus get

$$ \int_{M^n} e^{t f(x)} \, d\mu^n(x) \le \exp\left\{ t \int_{M^n} f(x) \, d\mu^n(x) + n\, \frac{\sigma^2 t^2}{2} \right\}. $$

But this means that $\sigma^2(\mu^n) \le n\, \sigma^2(\mu)$.

For an opposite bound, it is sufficient to test (1.1) for $(M^n, d_n, \mu^n)$ in the class of all coordinatewise Lipschitz functions of the form $f(x) = u(x_1) + \dots + u(x_n)$ with $\mu$-mean zero functions $u$ on $M$ such that $\|u\|_{\mathrm{Lip}} \le 1$.
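
The reason additive functions give the opposite bound is that their Laplace transform factorizes, so the log-Laplace transform of $f$ is exactly $n$ times that of $u$. A brute-force illustration on a toy space follows; the three-point space, the weights and the helper names are our own assumptions, chosen only to make the enumeration small.

```python
import itertools
import math

points = (0.0, 1.0, 3.0)     # a toy three-point space M inside the real line (hypothetical)
weights = (0.5, 0.3, 0.2)    # the measure mu on these points
mean = sum(w * x for x, w in zip(points, weights))
u = lambda x: x - mean       # mu-mean zero and 1-Lipschitz for d(x, y) = |x - y|

def laplace_one(t):
    """Integral of exp(t * u) against mu."""
    return sum(w * math.exp(t * u(x)) for x, w in zip(points, weights))

def laplace_product(t, n):
    """Integral of exp(t * (u(x_1) + ... + u(x_n))) against mu^n, by full enumeration."""
    total = 0.0
    for idx in itertools.product(range(len(points)), repeat=n):
        weight = math.prod(weights[i] for i in idx)
        total += weight * math.exp(t * sum(u(points[i]) for i in idx))
    return total

for t in (-1.0, 0.5, 2.0):
    assert math.isclose(laplace_product(t, 4), laplace_one(t) ** 4, rel_tol=1e-9)
```

Consequently, any $\sigma^2$ that is admissible in (1.1) for all such $f$ on $M^n$ must be at least $n$ times the smallest value admissible for every $u$ on $M$.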

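Corollary 5.6. For any Borel set $A \subset M^n$ such that $\mu^n(A) > 0$, the subgaussian constant of the normalized restricted measure $\mu^n_A$ with respect to the $\ell^1$-type metric $d_n$ satisfies

$$ \sigma^2(\mu^n_A) \le c\, n\, \sigma^2(\mu) \log\left(\frac{e}{\mu^n(A)}\right), $$

where $c$ is an absolute constant.

For example, if $\mu$ is a probability measure on $M = \mathbb{R}$ such that $\int_{-\infty}^\infty e^{x^2/\lambda^2} \, d\mu(x) \le 2$ ($\lambda > 0$), then for the restricted product measures we have

$$ \sigma^2(\mu^n_A) \le c\, n \lambda^2 \log\left(\frac{e}{\mu^n(A)}\right) \tag{5.2} $$

with respect to the $\ell^1$-norm $\|x\|_1 = |x_1| + \dots + |x_n|$ on $\mathbb{R}^n$.

Indeed, by the integral hypothesis on $\mu$, for any $f$ on $\mathbb{R}$ with $\|f\|_{\mathrm{Lip}} \le 1$,

$$ \int_{-\infty}^\infty \int_{-\infty}^\infty e^{(f(x) - f(y))^2/2\lambda^2} \, d\mu(x) d\mu(y) \le \int_{-\infty}^\infty \int_{-\infty}^\infty e^{(x - y)^2/2\lambda^2} \, d\mu(x) d\mu(y) \le \int_{-\infty}^\infty \int_{-\infty}^\infty e^{(x^2 + y^2)/\lambda^2} \, d\mu(x) d\mu(y) \le 4. $$

Hence, if $f$ has $\mu$-mean zero, by Jensen's inequality,

$$ \int_{-\infty}^\infty e^{f(x)^2/4\lambda^2} \, d\mu(x) \le \int_{-\infty}^\infty \int_{-\infty}^\infty e^{(f(x) - f(y))^2/4\lambda^2} \, d\mu(x) d\mu(y) \le 2, $$

meaning that $\|f\|_{L^{\psi_2}(\mu)} \le 2\lambda$. By Lemma 3.1, cf. (3.1), it follows that $\sigma^2(\mu) \le 16 \lambda^2$, so (5.2) holds true by an application of Corollary 5.6.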


6 Deviations for non-Lipschitz functions

Let us now turn to the interesting question on the relationship between the distribution of a locally Lipschitz function and the distribution of its modulus of the gradient. We still keep the setting of a metric probability space $(M, d, \mu)$ and assume it has a finite subgaussian constant $\sigma^2 = \sigma^2(\mu)$ ($\sigma \ge 0$).

Let us say that a continuous function $f$ on $M$ is locally Lipschitz, if $|\nabla f(x)|$ is finite for all $x \in M$. Recall that we consider the sets

$$ A = \{x \in M : |\nabla f(x)| \le L\}, \quad L > 0. \tag{6.1} $$

First we state a more general version of Corollary 1.2.

Theorem 6.1. Assume that a locally Lipschitz function $f$ on $M$ has Lipschitz semi-norms $\le L$ on the sets of the form (6.1). If $\mu\{|\nabla f| \ge L_0\} \le \frac{1}{2}$, then for all $t > 0$,

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 2 \inf_{L \ge L_0} \left[ e^{-t^2/c\sigma^2 L^2} + \mu\{|\nabla f| > L\} \right], \tag{6.2} $$

where $c$ is an absolute constant.

Proof. Although the argument is already mentioned in Section 1, let us replace (1.6) with a slightly different bound. Applying Theorem 1.1, the definition (1.1) yields

$$ \iint e^{t(f(x) - f(y))} \, d\mu_A(x) d\mu_A(y) \le e^{c\sigma^2 L^2 t^2/4}, \quad \text{for all } t \in \mathbb{R}, $$

where $A$ is defined in (6.1) with $L \ge L_0$, and where $c$ is a universal constant. From this, for any $t > 0$,

$$ (\mu_A \otimes \mu_A)\{(x, y) \in A \times A : |f(x) - f(y)| \ge t\} \le 2 e^{-t^2/c\sigma^2 L^2}, $$

and therefore

$$ (\mu \otimes \mu)\{(x, y) \in A \times A : |f(x) - f(y)| \ge t\} \le 2 e^{-t^2/c\sigma^2 L^2}. $$

The product measure of the complement of $A \times A$ does not exceed $2\mu\{|\nabla f(x)| > L\}$, and we obtain (6.2). □

If $\int e^{|\nabla f|^2} \, d\mu \le 2$, we have, by Chebyshev's inequality, $\mu\{|\nabla f| \ge L\} \le 2e^{-L^2}$, so one may take $L_0 = \sqrt{\log 4}$. Theorem 6.1 then gives that, for any $L^2 \ge \log 4$,

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 2e^{-t^2/c\sigma^2 L^2} + 4e^{-L^2}. $$

For $t \ge 2\sigma$ one may choose here $L^2 = t/\sigma$, leading to

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 6 e^{-t/c\sigma} $$

for some absolute constant $c > 1$. In case $0 \le t \le 2\sigma$, this inequality is fulfilled automatically, so it holds for all $t \ge 0$. As a result, with some absolute constant $C$,

$$ \|f(x) - f(y)\|_{\psi_1} \le C\sigma, $$

which is an equivalent way to state the inequality of Corollary 1.2.

As we have already mentioned, with the same arguments inequalities like (6.2) can be derived on the basis of subgaussian constants defined for different classes of functions. For example, one may consider the subgaussian constant $\sigma^2_{\mathcal{F}}(\mu)$ for the class $\mathcal{F}$ of all convex Lipschitz functions $f$ on the Euclidean space $M = \mathbb{R}^n$ (which we equip with the Euclidean distance). Note that $|\nabla f(x)|$ is everywhere finite in the $n$-space, when $f$ is convex. Keeping in mind Remark 3.3, what we need is the following analog of Kirszbraun's theorem:

Lemma 6.2. Let $f$ be a convex function on $\mathbb{R}^n$. For any $L > 0$, there exists a convex function $g$ on $\mathbb{R}^n$ such that $f = g$ on the set $A = \{x : |\nabla f(x)| \le L\}$ and $|\nabla g| \le L$ on $\mathbb{R}^n$.

Accepting for a moment this lemma without proof, we get:

Theorem 6.3. Assume that a convex function $f$ on $\mathbb{R}^n$ satisfies $\mu\{|\nabla f| \ge L_0\} \le \frac{1}{2}$. Then for all $t > 0$,

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 2 \inf_{L \ge L_0} \left[ e^{-t^2/c\sigma^2 L^2} + \mu\{|\nabla f| > L\} \right], $$

where $\sigma^2 = \sigma^2_{\mathcal{F}}(\mu)$ and $c$ is an absolute constant.

For illustration, let $\mu = \mu_1 \otimes \dots \otimes \mu_n$ be an arbitrary product probability measure on the cube $[-1,1]^n$. If $f$ is convex and Lipschitz on $\mathbb{R}^n$, thus with $|\nabla f| \le 1$, then

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 2e^{-t^2/c}. \tag{6.3} $$

This is one of the forms of Talagrand's concentration phenomenon for the family of convex sets/functions (cf. [T1], [T2], [M]). That is, the subgaussian constants $\sigma^2_{\mathcal{F}}(\mu)$ are bounded for the class $\mathcal{F}$ of convex Lipschitz $f$ and product measures $\mu$ on the cube. Hence, using Theorem 6.3, Talagrand's deviation inequality (6.3) admits a natural extension to the class of non-Lipschitz convex functions:

Corollary 6.4. Let $\mu$ be a product probability measure on the cube, and let $f$ be a convex function on $\mathbb{R}^n$. If $\mu\{|\nabla f| \ge L_0\} \le \frac{1}{2}$, then for all $t > 0$,

$$ (\mu \otimes \mu)\{|f(x) - f(y)| \ge t\} \le 2 \inf_{L \ge L_0} \left[ e^{-t^2/cL^2} + \mu\{|\nabla f| > L\} \right], $$

where $c$ is an absolute constant.

In particular, we have a statement similar to Corollary 1.2 for this family of functions, namely

$$ \|f - m\|_{L^{\psi_1}(\mu)} \le c\, \|\nabla f\|_{L^{\psi_2}(\mu)}, $$

where $m$ is the $\mu$-mean of $f$.

Proof of Lemma 6.2. An affine function $l_{a,v}(x) = a + \langle x, v \rangle$ ($v \in \mathbb{R}^n$, $a \in \mathbb{R}$) may be called a tangent function to $f$, if $f \ge l_{a,v}$ on $\mathbb{R}^n$ and $f(x) = l_{a,v}(x)$ for at least one point $x$. It is well-known that

$$ f(x) = \sup\{ l_{a,v}(x) : l_{a,v} \in \mathcal{L} \}, $$

where $\mathcal{L}$ denotes the collection of all tangent functions $l_{a,v}$. Put

$$ g(x) = \sup\{ l_{a,v}(x) : l_{a,v} \in \mathcal{L}, \ |v| \le L \}. $$

By the construction, $g \le f$ on $\mathbb{R}^n$ and, moreover,

$$ \|g\|_{\mathrm{Lip}} \le \sup\{ \|l_{a,v}\|_{\mathrm{Lip}} : l_{a,v} \in \mathcal{L}, \ |v| \le L \} = \sup\{ |v| : l_{a,v} \in \mathcal{L}, \ |v| \le L \} \le L. $$

It remains to show that $g = f$ on the set $A = \{|\nabla f| \le L\}$. Let $x \in A$ and let $l_{a,v}$ be tangent to $f$ and such that $l_{a,v}(x) = f(x)$. This implies that $f(y) - f(x) \ge \langle y - x, v \rangle$ for all $y \in \mathbb{R}^n$ and hence

$$ |\nabla f(x)| = \limsup_{y \to x} \frac{|f(y) - f(x)|}{|y - x|} \ge \limsup_{y \to x} \frac{\langle y - x, v \rangle}{|y - x|} = |v|. $$

Thus, $|v| \le L$, so that $g(x) \ge l_{a,v}(x) = f(x)$. □
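
The construction in the proof can be illustrated in dimension one. The sketch below is only an example under our own assumptions: we take the hypothetical test function $f(x) = x^2$, whose tangent at the point $s$ is $x \mapsto 2sx - s^2$ with slope $v = 2s$, and approximate the supremum over slopes $|v| \le L$ by a grid search. The names `f`, `g` and the grid size are ours.

```python
def f(x):
    return x * x

def g(x, L, grid=2001):
    """Sup of the tangent lines 2*s*x - s*s of f(x) = x^2 over slopes |2*s| <= L (grid search in s)."""
    s_max = L / 2.0
    candidates = (-s_max + 2.0 * s_max * k / (grid - 1) for k in range(grid))
    return max(2.0 * s * x - s * s for s in candidates)

L = 2.0                                           # here A = {|f'| <= L} = [-1, 1]
assert abs(g(0.5, L) - f(0.5)) < 1e-9             # g = f on A
assert abs(g(3.0, L) - (L * 3.0 - L * L / 4)) < 1e-9   # outside A, g is linear with slope L
assert all(g(x / 10, L) <= f(x / 10) + 1e-12 for x in range(-50, 51))   # g <= f everywhere
```

In this example $g(x) = x^2$ for $|x| \le L/2$ and $g(x) = L|x| - L^2/4$ otherwise, so $g$ is convex, coincides with $f$ on $A$, and has slopes bounded by $L$.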


7 Optimality

Here we show that the logarithmic dependence in $\mu(A)$ in Theorems 1.1 and 1.3 is optimal, up to the universal constant $c$. We provide several examples.

Example 1. Let us return to Example 4), Section 5, of the hypercube $M = \{0,1\}^n$, which we equip with the Hamming distance $d_n$ and the uniform measure $\mu$. Let us test the inequality (5.1) of Corollary 5.4 on the set $A \subset \{0,1\}^n$ consisting of $n + 1$ points

$$ (0,0,0,\dots,0), \ (1,0,0,\dots,0), \ (1,1,0,\dots,0), \ \dots, \ (1,1,1,\dots,1). $$

We have $\mu(A) = (n+1)/2^n \ge 1/2^n$. The function $f : A \to \mathbb{R}$, defined by

$$ f(x) = \#\{i : x_i = 1\} - \frac{n}{2}, $$

has a Lipschitz semi-norm $\|f\|_{\mathrm{Lip}} \le 1$ with respect to $d_n$ and the $\mu_A$-mean zero. Moreover, $\int f^2 \, d\mu_A = \frac{n(n+2)}{12}$. Expanding the inequality $\int e^{tf} \, d\mu_A \le e^{\sigma^2(\mu_A) t^2/2}$ at the origin yields $\int f^2 \, d\mu_A \le \sigma^2(\mu_A)$. Hence, recalling that $\sigma^2(\mu) \le \frac{n}{4}$ and that $\log(1/\mu(A)) \le n \log 2$, we get

$$ \sigma^2(\mu_A) \ge \int f^2 \, d\mu_A = \frac{n(n+2)}{12} \ge \frac{1}{3 \log 2}\, \sigma^2(\mu) \log\left(\frac{1}{\mu(A)}\right). $$

This example shows the optimality of (5.1) in the regime $\mu(A) \to 0$.
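
The second moment in Example 1 and the resulting lower bound can be confirmed with exact rational arithmetic. This is an illustrative check only; the helper names are ours, and the value $\sigma^2(\mu) = n/4$ is the optimal constant for the cube quoted in Section 5.

```python
from fractions import Fraction
import math

def chain_second_moment(n: int) -> Fraction:
    """Integral of f^2 over mu_A for the chain A of n+1 points and f(x) = #{i: x_i = 1} - n/2."""
    return sum((Fraction(k) - Fraction(n, 2)) ** 2 for k in range(n + 1)) / (n + 1)

def lower_bound(n: int) -> float:
    """(1 / (3 log 2)) * sigma^2(mu) * log(1 / mu(A)) with sigma^2(mu) = n/4, mu(A) = (n+1)/2^n."""
    mu_A = Fraction(n + 1, 2 ** n)
    return (n / 4) * math.log(1 / mu_A) / (3 * math.log(2))

for n in range(1, 41):
    assert chain_second_moment(n) == Fraction(n * (n + 2), 12)
    assert float(chain_second_moment(n)) >= lower_bound(n)
```

The closed form $n(n+2)/12$ is the variance of the uniform distribution on $\{0, 1, \dots, n\}$, which is the law of $\#\{i : x_i = 1\}$ under $\mu_A$.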

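Example 2. Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$ of dimension $n \ge 2$. We have
$\sigma^2(\gamma_n) = 1$. Consider the normalized measure $\gamma_{A_R}$ on the set

$$A_R = \{(x_1, x_2, \dots, x_n) \in \mathbb{R}^n : x_1^2 + x_2^2 \ge R^2\}, \quad R \ge 0.$$

Using the property that the function $\frac12(x_1^2 + x_2^2)$ has a standard exponential distribution
under the measure $\gamma_n$, we find that $\gamma_n(A_R) = e^{-R^2/2}$. Moreover,

$$s^2(\gamma_{A_R}) \ge {\rm Var}_{\gamma_{A_R}}(x_1) = \int x_1^2\,d\gamma_{A_R}(x) = \frac12 \int (x_1^2 + x_2^2)\,d\gamma_{A_R}(x) = \frac{R^2}{2} + 1 = \log\Big(\frac{e}{\gamma_n(A_R)}\Big).$$

Therefore,

$$\sigma^2(\gamma_{A_R}) \ge s^2(\gamma_{A_R}) \ge \log\Big(\frac{e}{\gamma_n(A_R)}\Big),$$

showing that the inequality of Theorem 1.1 is optimal, up to the universal constant,
for any value of $\gamma_n(A) \in [0, 1]$.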
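The identity behind Example 2 is that a standard exponential variable $T$ satisfies $\mathbb{E}[T \mid T \ge a] = a + 1$, applied with $a = R^2/2$. A minimal numerical sketch follows, assuming a plain midpoint rule on a truncated range; the function names and the truncation parameters `upper` and `steps` are illustrative choices, not part of the paper.

```python
import math

def conditional_mean_exponential(a, upper=60.0, steps=200_000):
    """Approximate E[T | T >= a] for a standard exponential T.

    After the substitution r = a + s this is the integral of
    (a + s) * exp(-s) over s in [0, upper]; the tail beyond `upper`
    is negligible for the default value."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (a + s) * math.exp(-s)
    return total * h

def log_e_over_measure(R):
    """log(e / gamma_n(A_R)) with gamma_n(A_R) = exp(-R^2 / 2)."""
    return math.log(math.e / math.exp(-R * R / 2))
```

For instance, with $R = 2$ both `conditional_mean_exponential(R * R / 2)` and `log_e_over_measure(R)` are numerically close to $R^2/2 + 1 = 3$.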

Example 3. A similar conclusion can be made about the uniform probability measure $\mu$ on
the Euclidean ball $B(0, \sqrt{n})$ of radius $\sqrt{n}$, centred at the origin (asymptotically for growing
dimension $n$). To see this, it is sufficient to consider the cylinders

$$A_\varepsilon = \{(x_1, y) \in \mathbb{R} \times \mathbb{R}^{n-1} : |x_1| \le \sqrt{n - \varepsilon^2} \ \text{and}\ |y| \le \varepsilon\}, \quad 0 < \varepsilon \le \sqrt{n},$$

and the function $f(x) = x_1$. We leave the corresponding computations to the reader.

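Example 4. Let $\mu$ be the two-sided exponential measure on $\mathbb{R}$ with density $\frac12 e^{-|x|}$. In this
case $\sigma^2(\mu) = \infty$, but, as is easy to see, $2 \le s^2(\mu) \le 4$ (recall that $\lambda_1(\mu) = \frac14$). We are going to
test optimality of the inequality of Theorem 1.3 on the sets $A_R = \{x \in \mathbb{R} : |x| \ge R\}$ ($R \ge 0$). Clearly,
$\mu(A_R) = e^{-R}$, and we find that

$$\int_{-\infty}^{\infty} x^2\,d\mu_{A_R}(x) = e^{R}\int_R^{\infty} r^2 e^{-r}\,dr = R^2 + 2R + 2 \ge (R+1)^2 = \log^2\Big(\frac{e}{\mu(A_R)}\Big).$$

Since $\mu_{A_R}$ is symmetric about the origin, the left-hand side is the variance of the 1-Lipschitz function $f(x) = x$. Therefore,

$$s^2(\mu_{A_R}) \ge \log^2\Big(\frac{e}{\mu(A_R)}\Big),$$

showing that the inequality is optimal, up to the universal constant, for any value of
$\mu(A) \in (0,1]$.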
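The closed form $e^{R}\int_R^\infty r^2 e^{-r}\,dr = R^2 + 2R + 2$ used in Example 4 can be confirmed numerically. The sketch below is an illustration under stated assumptions: it uses a midpoint rule after the substitution $r = R + s$, and the function names and truncation parameters are hypothetical.

```python
import math

def conditional_second_moment(R, upper=80.0, steps=200_000):
    """Approximate exp(R) * integral_R^inf r^2 exp(-r) dr.

    With r = R + s this is the integral of (R + s)^2 * exp(-s)
    over s in [0, upper]; the remaining tail is negligible."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (R + s) ** 2 * math.exp(-s)
    return total * h

def log_squared_bound(R):
    """log^2(e / mu(A_R)) with mu(A_R) = exp(-R), i.e. (R + 1)^2."""
    return math.log(math.e / math.exp(-R)) ** 2
```

For $R = 1$ the quadrature returns a value close to $5 = 1 + 2 + 2$, which exceeds `log_squared_bound(1.0)`, numerically close to $4$.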
Appendix

Proof of Lemma 2.1. Using the homogeneity, in order to derive the right-hand side inequality
in (2.1) we may assume that $\sup_{p \ge 1} \frac{\|f\|_p}{\sqrt{p}} \le 1$. Then $\int |f|^p\,d\mu \le p^{p/2}$ for all $p \ge 1$, and by
Chebyshev's inequality,

$$1 - F(t) \equiv \mu\{|f| \ge t\} \le \Big(\frac{\sqrt{p}}{t}\Big)^p \quad \text{for all } t > 0.$$


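If $t \ge 2$, choose here $p = \frac{t^2}{4}$, in which case $1 - F(t) \le 2^{-t^2/4}$. Integrating by parts, we have,
for any $0 < \varepsilon < \frac{\log 2}{4}$,

$$\begin{aligned}
\int e^{\varepsilon f^2}\,d\mu &= -\int_0^\infty e^{\varepsilon t^2}\,d(1 - F(t)) \\
&= 1 + 2\varepsilon \int_0^2 t e^{\varepsilon t^2}(1 - F(t))\,dt + 2\varepsilon \int_2^\infty t e^{\varepsilon t^2}(1 - F(t))\,dt \\
&\le 1 + 2\varepsilon \int_0^2 t e^{\varepsilon t^2}\,dt + 2\varepsilon \int_2^\infty t e^{\varepsilon t^2} e^{-\frac{\log 2}{4} t^2}\,dt \\
&= e^{4\varepsilon} + \frac{\varepsilon}{\frac{\log 2}{4} - \varepsilon}\, e^{-(\log 2 - 4\varepsilon)} = e^{4\varepsilon}\Big(1 + \frac{\varepsilon}{2\big(\frac{\log 2}{4} - \varepsilon\big)}\Big).
\end{aligned}$$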


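If $\varepsilon \le \frac{\log 2}{8}$, the latter expression does not exceed $\frac32 e^{4\varepsilon}$, which does not exceed 2 for $\varepsilon \le \frac14 \log(4/3)$. Both inequalities are fulfilled for $\varepsilon = \frac{\log 2}{10}$, and with this value $\int e^{\varepsilon f^2}\,d\mu \le 2$. Hence

$$\|f\|_{\psi_2} \le \frac{1}{\sqrt{\varepsilon}} = \sqrt{\frac{10}{\log 2}} < 4,$$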


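which yields the right inequality in (2.1). Conversely, if $\|f\|_{\psi_2} = 1$, then $\int e^{f^2}\,d\mu = 2$.
Since $u(t) = t^p e^{-t^2}$ is maximized in $t \ge 0$ at $t_0 = \sqrt{p/2}$, we get

$$\|f\|_p^p = \int u(|f|)\,e^{f^2}\,d\mu \le u(t_0) \cdot 2 = 2\Big(\frac{p}{2e}\Big)^{p/2}.$$

Hence, $\frac{1}{\sqrt{p}}\|f\|_p \le 1$, which yields the left inequality.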
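The choice of $\varepsilon$ in the first half of the proof can be explored numerically. The sketch below assumes the upper bound has the form $e^{4\varepsilon}\big(1 + \frac{\varepsilon}{2(\frac{\log 2}{4} - \varepsilon)}\big)$, which is what the integration by parts gives; the names `psi2_bound` and `admissible_eps_max` are hypothetical. It checks that every $\varepsilon$ up to $\min\{\frac{\log 2}{8}, \frac14\log\frac43\}$ keeps the bound below 2, and that the resulting constant $1/\sqrt{\varepsilon}$ can be taken below 4.

```python
import math

def psi2_bound(eps):
    """exp(4*eps) * (1 + eps / (2*(log(2)/4 - eps))) for 0 < eps < log(2)/4."""
    c = math.log(2) / 4
    if not 0 < eps < c:
        raise ValueError("eps must lie strictly between 0 and log(2)/4")
    return math.exp(4 * eps) * (1 + eps / (2 * (c - eps)))

def admissible_eps_max():
    """Largest eps satisfying both sufficient conditions stated in the proof."""
    return min(math.log(2) / 8, math.log(4 / 3) / 4)

def bound_holds_on_grid(points=1000):
    """Check psi2_bound(eps) <= 2 on a uniform grid of (0, admissible_eps_max()]."""
    top = admissible_eps_max()
    return all(psi2_bound(top * k / points) <= 2 for k in range(1, points + 1))
```

Any $\varepsilon$ in the interval $(\frac{1}{16}, \frac14\log\frac43]$ therefore gives a constant strictly below 4; the two conditions are sufficient, not necessary, so this is only a consistency check.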

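Now, let us turn to (2.2) and assume that $\sup_{p \ge 1} \frac{\|f\|_p}{p} = 1$. Then $\int |f|^p\,d\mu \le p^p$ for all
$p \ge 1$, and by Chebyshev's inequality, for all $t > 0$,

$$1 - F(t) = \mu\{|f| \ge t\} \le \Big(\frac{p}{t}\Big)^p.$$

If $t \ge 2$, we may choose here $p = \frac12 t$, in which case $1 - F(t) \le 2^{-t/2}$, while for $1 \le t < 2$ we
choose $p = 1$, so that $1 - F(t) \le \frac1t$. Arguing as before, we have, for any $0 < \varepsilon < \frac{\log 2}{2}$,

$$\begin{aligned}
\int e^{\varepsilon |f|}\,d\mu &= 1 + \varepsilon \int_0^1 e^{\varepsilon t}(1 - F(t))\,dt + \varepsilon \int_1^2 e^{\varepsilon t}(1 - F(t))\,dt + \varepsilon \int_2^\infty e^{\varepsilon t}(1 - F(t))\,dt \\
&\le 1 + \varepsilon \int_0^1 e^{\varepsilon t}\,dt + \varepsilon \int_1^2 \frac{e^{\varepsilon t}}{t}\,dt + \varepsilon \int_2^\infty e^{\varepsilon t} e^{-\frac{\log 2}{2} t}\,dt.
\end{aligned}$$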




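The pre-last integral can be bounded by $e^{2\varepsilon} \int_1^2 \frac{dt}{t} = e^{2\varepsilon} \log 2$, so

$$\int e^{\varepsilon |f|}\,d\mu \le e^{\varepsilon} + \varepsilon e^{2\varepsilon} \log 2 + \frac12\, e^{2\varepsilon}\, \frac{\varepsilon}{\frac{\log 2}{2} - \varepsilon}.$$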


For $\varepsilon = \frac16$ the latter expression is equal to 1.98903902..., and thus $\int e^{|f|/6}\,d\mu < 2$. Hence $\|f\|_{\psi_1} \le 6$.

Conversely, if $\|f\|_{\psi_1} = 1$, then $\int e^{|f|}\,d\mu = 2$. Since $u(t) = t^p e^{-t}$ is maximized at $t_0 = p$,
we get

$$\|f\|_p^p = \int u(|f|)\,e^{|f|}\,d\mu \le u(t_0) \cdot 2 = 2\Big(\frac{p}{e}\Big)^p.$$

Hence, $\frac1p \|f\|_p \le 1$, which yields the left inequality.


□ 
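The numerical value 1.98903902... quoted in the second half of the proof is reproduced by evaluating the bound $e^{\varepsilon} + \varepsilon e^{2\varepsilon}\log 2 + \frac12 e^{2\varepsilon}\frac{\varepsilon}{\frac{\log 2}{2} - \varepsilon}$ at $\varepsilon = \frac16$. The following is a short sketch under that reading of the formula; `psi1_bound` is a hypothetical name.

```python
import math

def psi1_bound(eps):
    """Upper bound on the integral of exp(eps * |f|) obtained in the proof:
    exp(eps) + eps*exp(2*eps)*log(2) + 0.5*exp(2*eps)*eps / (log(2)/2 - eps),
    valid for 0 < eps < log(2)/2."""
    c = math.log(2) / 2
    if not 0 < eps < c:
        raise ValueError("eps must lie strictly between 0 and log(2)/2")
    first = math.exp(eps)                                # range 0 <= t <= 1
    second = eps * math.exp(2 * eps) * math.log(2)       # range 1 <= t <= 2
    third = 0.5 * math.exp(2 * eps) * eps / (c - eps)    # range t >= 2
    return first + second + third
```

Since `psi1_bound(1 / 6)` stays below 2, the value $\varepsilon = \frac16$ corresponds to the constant $1/\varepsilon = 6$ for the $\psi_1$-norm.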


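Proof of Lemma 3.1. First assume that $\|f\|_{\psi_2} = 1$, i.e., $\int e^{f^2}\,d\mu = 2$. The function

$$u(t) = \log \int e^{tf}\,d\mu$$

is smooth, convex, with $u(0) = 0$ and

$$u'(t) = \frac{\int f e^{tf}\,d\mu}{\int e^{tf}\,d\mu}.$$

In particular, $u'(0) = 0$. Note that, by Jensen's inequality, $\int e^{tf}\,d\mu \ge 1$, so $u(t) \ge 0$. Further
differentiation gives

$$u''(t) = \frac{\int f^2 e^{tf}\,d\mu \int e^{tf}\,d\mu - \big(\int f e^{tf}\,d\mu\big)^2}{\big(\int e^{tf}\,d\mu\big)^2} \le \int f^2 e^{tf}\,d\mu.$$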

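Using $tf \le \frac{t^2 + f^2}{2}$ and the elementary inequality $x^2 e^{-x^2/2} \le 2e^{-1}$, we get, for $|t| \le 1$,

$$\int f^2 e^{tf}\,d\mu \le \int f^2 e^{\frac{t^2 + f^2}{2}}\,d\mu = e^{t^2/2} \int f^2 e^{f^2/2}\,d\mu \le e^{t^2/2} \cdot 2e^{-1} \int e^{f^2}\,d\mu \le 4.$$

Thus, $u''(t) \le 4$, and by Taylor's formula, $u(t) \le 2t^2$.
On the other hand, for $|t| \ge 1$, by Cauchy's inequality,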


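$$\int e^{tf}\,d\mu \le \int e^{\frac{t^2 + f^2}{2}}\,d\mu = e^{t^2/2} \int e^{f^2/2}\,d\mu \le e^{t^2/2} \Big(\int e^{f^2}\,d\mu\Big)^{1/2} = \sqrt{2}\, e^{t^2/2}.$$

Hence, in this case $u(t) \le \frac{t^2}{2} + \frac12 \log 2 \le t^2$. Thus, $u(t) \le 2t^2$ for all $t \in \mathbb{R}$, that is,

$$\sigma_f^2 \le 4,$$

proving the right inequality of Lemma 3.1.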

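For the left inequality, let $\sigma_f^2 = 1$. Then $\int e^{tf}\,d\mu \le e^{t^2/2}$ for all $t \in \mathbb{R}$, which implies

$$1 - F(t) \equiv \mu\{|f| \ge t\} \le 2e^{-t^2/2}, \quad t \ge 0.$$

From this, integrating by parts, we have, for any $0 < \varepsilon < \frac12$,

$$\begin{aligned}
\int e^{\varepsilon f^2}\,d\mu &= \int_0^\infty e^{\varepsilon t^2}\,dF(t) \\
&= 1 + 2\varepsilon \int_0^\infty t e^{\varepsilon t^2}(1 - F(t))\,dt \\
&\le 1 + 4\varepsilon \int_0^\infty t e^{-(\frac12 - \varepsilon)t^2}\,dt = 1 + \frac{2\varepsilon}{\frac12 - \varepsilon}.
\end{aligned}$$

The last expression is equal to 2 for $\varepsilon = \frac16$, which means that $\|f\|_{\psi_2} \le \sqrt{6}$.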


□ 
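The last step of the proof rests on solving $1 + \frac{2\varepsilon}{\frac12 - \varepsilon} = 2$, which gives $\varepsilon = \frac16$ and hence the constant $1/\sqrt{\varepsilon} = \sqrt{6}$. A minimal exact check with rational arithmetic is sketched below, assuming the bound has this form; `gaussian_tail_bound` is a hypothetical name.

```python
from fractions import Fraction

def gaussian_tail_bound(eps):
    """Exact value of 1 + 2*eps / (1/2 - eps) for a rational 0 < eps < 1/2.

    This is the bound on the integral of exp(eps * f^2) obtained from
    the tail estimate 1 - F(t) <= 2 * exp(-t^2 / 2)."""
    half = Fraction(1, 2)
    eps = Fraction(eps)
    if not 0 < eps < half:
        raise ValueError("eps must lie strictly between 0 and 1/2")
    return 1 + 2 * eps / (half - eps)
```

The bound is increasing in $\varepsilon$, so $\frac16$ is the largest value for which it does not exceed 2.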










